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We discuss a method for determining the optimally-localized set of generalized Wannier functions 
associated with a set of Bloch bands in a crystalline solid. By "generalized Wannier functions" 
we mean a set of localized orthonormal orbitals spanning the same space as the specified set of 
Bloch bands. Although we minimize a functional that represents the total spread ^ ('"'^)n — (r)^ 
of the Wannier functions in real space, our method proceeds directly from the Bloch functions as 
represented on a mesh of k-points, and carries out the minimization in a space of unitary matrices 
Umn describing the rotation among the Bloch bands at each k-point. The method is thus suitable 
for use in connection with conventional electronic-structure codes. The procedure also returns the 
total electric polarization as well as the location of each Wannier center. Sample results for Si, 
GaAs, molecular C2H4 and LiCl will be presented. 
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I. INTRODUCTION 

The study of periodic crystalline solids leads naturally 
to a representation for the electronic ground state in 
terms of extended Bloch orbitals V^nklr), labeled via their 
band n and crystal-momentum k quantum numbers. An 
alternative representation can be derived in terms of lo- 
calized orbitals or Wannier functions w„(r — R), that 
are formally defined via a unitary transformation of the 
Bloch orbitals, and are labeled in real space according to 
the band n and the iattice vector of the unit cell R to 
which they belong .IJq 

The Wannier representation of the electronic problem 
is widely known for its usefulness as a starting point for 
various formal developments, such as the semiclassical 
theory of electron dynamics or the theory of magnetic 
interactions in solids. But until recently, the practical 
importance of Wannier functions in computational elec- 
tronic structure theory has been fairly minimal. How- 
ever, this situation is now beginning to change, in view 
of two recent developments. First, there is a vigorous ef- 
fort underway on the part of many groups to develop so- 
called "order-N" or "linear-scaling" methods, i.e., meth- 
ods for which the computational time for solving for the 
electronic ground state scales only as the first power of 
system size,B instead of the third power typical of conven- 
tional methods based on solving for Bloch states. Many 
of these methods are based on solving directly for local- 
ized Wannier or ffiannier-like orbitals that span the occu- 
pied subspace,U~Hl and thus rely on the localization prop- 
erties of the Wannier functions. Second, a modern theory 
of electric polariZ|aliaa of crystalline insulators has just 
recently emerg ed;liira it can be formulated in terms of 
a geometric phase in the Bloch representation, or equiv- 
alently, in terms of the locations of the Wannier centers. 

The linear-scaling and polarization developments are 
at the heart of the motivation for the present work. How- 



ever, there is another motivation that goes back to a 
theme that has recurred frequently in the chemistry lit- 
erature over the last 40 yeacv-pamely the study of "lo- 
calized molecular orbitals. "EtEj The idea is to carry out, 
for a given molecule or cluster, a unitary transformation 
from the occupied one-particle Hamiltonian eigenstates 
to a set of localized orbitals that correspond more closely 
to the chemical (Lewis) view of molecular bond-orbitals. 
ft seems not to be widely appreciated that these are 
the exact analogues, for finite systems, of the Wannier 
functions defined for infinite periodic systems. Various 
criteria have been -iiitcoduced for defining the localized 
molecular orbitals,c3LJ two of theL.most populat-being 
the maximization of the CoulombE3 or quadratico self- 
interactions of the molecular orbitals. One of the motiva- 
tions for such approaches is the notion that the localized 
molecular orbitals may form the basis for an efficient rep- 
resentation of electronic correlations in many-body ap- 
proaches, and indeed this ought to be equally true in the 
extended, sofid-state case. 

One major reason why the Wannier functions have seen 
little practical use to date in solid-state applications is 
undoubtedly their non-uniqueness. Even in the case of 
a single isolated band, it is well known that the Wan- 
nier functions Wn{v) are not unique, due to a phase inde- 
terminacy e"'^"^'*^^ in the Bloch orbitals V'nkCr). For this 
case, the conditions required to obtain a set of maximally 
localizodp-exponentially decaying Wannier functions are 
known.HJ23 

In the present work we discuss the determination of 
the maximally localized Wannier functions for the case 
of composite bands. Now a stronger indeterminacy is 
present, representable by a free unitary matrix Uml 
among the occupied Bloch orbitals at every wavevector. 
We require the choice of a particular set of Umn according 
to the criterion that the sum f2 of the second moments 
of the corresponding Wannier functions be minimized. 
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(This is the exact analogue of the criteria of Boy^3 for 
the molecular-orbital case.) We show that O can be de- 
composed into a sum of two contributions. The first is 
invariant with respect to the Umn and reflects the k-space 
dispersion of the band projection operator, while the sec- 
ond reflects the extent to which the Wannier functions 
fail to be eigenfunctions of the band-projected position 
operators. We show how this formulation reduces to pre- 
vious ones in the case of a single isolated band, or in one 
dimension, or for centrosymmetric crystals. 

We also describe a numerical algorithm for computing 
the optimally localized Wannier functions on a k-space 
mesh. The algorithm is designed to operate in a post- 
processing mode after a conventional band-structure cal- 
culation, taking as its input the Bloch functions com- 
puted on a mesh of k-points. (Thus, it is not a linear- 
scaling method.) We present sample results for the opti- 
mally localized Wannier functions in Si, GaAs, molecular 
C2H4, and LiCl. It should be emphasized that this pro- 
cedure generates incidentally a set of Wannier-center po- 
sitions; these by themselves can sometimes be very useful 
for analyzing the electronic polarization of disordered or 
distorted insulating materials. 

In this work, we have not considered any further 
generalizations of the problem, although several inter- 
esting possibilities come to mind. For example, one 
could relax the constraint that the Wannier functions 
should be orthonormal to each other (in this case 
they should probably not be called "Wannier func- 
tions"). Such functions would correspond to the "local- 
ized orbitals" or "support fimfvtipns" appearing in cer- 
tain linear-scaling methoda3c3't3 and in the chemical- 
pseudopotential approach .EZI"E3 Alternatively, one could 
retain the orthonormality requirement, but ask to flnd a 
larger set of functions spanning a space containing the 
desired bands as a subspace. For example, in Si one 
could ask for a maximally-localized set of four Wannier- 
like functions per atom spanning a space twice as large 
as, but containing, the space of the four occupied va- 
lence bands.lj Again, this is very similar to what is done 
in certain linear-scaling methodsoHj These interesting 
generalizations deserve investigation, but have not been 
pursued here. 

The manuscript is organized as follows. The problem 
is introduced in Sec. |l[ Expressions for the spread func- 
tional, and for its decomposition into gauge-invariant and 
gauge- depe ndent parts, are developed first in real space 
in Sec. pT[| Section |^ then formulates the correspond- 
ing expressions in discrete k-space (that is, on a mesh 
of wavevectors) . Special features that arise in one di- 
mension, or for a single isolated band, or for a crystal 
with inversion symmetry, are also discussed there, as is 
the steepest-descent minimization algorithm that we use. 
Some discussion and speculation about the asymptotic 
localization properties, and the real vs. complex nature 
of the Wannier functions, appear in Sec. 0. In Sec. 



we present test results for Si, GaAs, C2H4, and LiCl sys- 



tems. Finally, in Sec. VII, we discuss the significance of 
the work, emphasizing possible applications of our ap- 
proach. Some details of the real-space, discrete k-space, 
and continuous k-space formulations are deferred to Ap- 
pendices and [c|, respectively. In particular, the re- 
lationship of the present work to the theory of adiabatic 
quantum phases and quantum distances is discussed in 
Appendix 



II. PRELIMINARIES 
A. Isolated and composite bands 

We confine ourselves here to the case of an inde- 
pendent-particle Hamiltonian H = /2m -\- V{t) with 
a real periodic potential V[r). We thus assume the ab- 
sence of electric and magnetic fields, and we suppress 
spin. The eigenfunctions of H are the Bloch functions 
V'nkCr) labeled by band n and wavevector k. 

A Bloch band is said to be isolated if it does not become 
degenerate with any other band anywhere in the Brillouin 
zone (BZ). Conversely, a group of bands are said to form a 
composite group if they are connected among themselves 
by degeneracies, but are isolated from all lower or higher 
bands. For example, in Si the four valence bands form a 
composite group, while in GaAs the lowest valence band 
is isolated and the higher three form a composite group. 

In the case of isolated bands, it is natural to define 
Wannier functions individually for each band. That is, 
the Wannier function for band n (together with its peri- 
odic images) spans the same space as does the isolated 
Bloch band. In the case of composite bands, however, it 
is more natural to consider a set of J "generalized Wan- 
nier functions" that (together with their periodic images) 
span the same space as the composite set of J Bloch 
bands. That is, the "generalized Bloch functions" V'nk 
that are connected with the n'th generalized Wannier 
function will not necessarily be eigenstates of the Hamil- 
tonian at this k, but will be related to them by a J x J 
unitary transformation. 

The formulation that follows is designed to apply 
equally to the isolated and composite cases. For the 
isolated case, J = 1, and sums over n can be ignored. 
For the composite case, the terms "Bloch function" and 
"Wannier function" should be understood to be meant 
in the generalized sense discussed above. 

It may sometimes be convenient to consider a group of 
bands as composite even when some of the members are 
actually isolated. For example, one may wish to consider 
all of the occupied valence bands of an insulator as a com- 
posite group. This is rather natural in connection with 
linear-scaling algorithms and the theory of electronic po- 
larization. Thus, for GaAs, one may choose to regard all 
four valence bands as a composite group. In this case 
the Wannier functions will resemble a-bonded pairs of 
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sp^ hybrids, arguably the most natural choice. More- 
over, the GaAs Wannier functions defined in this way 
turn out to be considerably more localized than those of 
the top three or bottom valence bands separately. Again, 
the formulation below should be taken to apply equally 
to this case, with n running over the J adjacent bands 
that are being considered as a composite group. 

Finally, the formalism applies equally to any isolated 
band or composite group that may exist in a metal or 
insulator, regardless of occupation. However, because 
the expectation values of physical operators only depend 
upon occupied states, one is usually interested in the case 
of occupied bands in insulators. 



Introducing the notation f„ = (On|r[On) and (r^)„ = 
(On|r^|On) for the diagonal elements in the cell at the 
origin, we have 



V 



(27r)3 



dk (u„k|Vk|w„k) 



and 



V 
W) 



(7) 



(8) 



This last follows from Eq. (||) after an integration by 
parts. 



B. Definitions 



C. Arbitrariness in definition of Wannier functions 



We denote by w„(r — R) or |Rn} the Wannier function 
in cell R associated with band n, given in terms of the 
Bloch functions as 



|Rn} 



V 



(27r)3 



dke-^'^-^lV'nk) 



so that 



|V'«k) =^e^'^-^|Rn) 



(1) 



(2) 



R 



Here V is the real-space primitive cell volume. It is easily 
shown that the Wannier functions form an orthonormal 
set. As usual, the periodic part of the Bloch function is 
defined as 



Mnk(r) = e '''■'"i/'„k(r) 



(3) 



As shown by Blount ,0 matrix elements of the position 
operator between Wannier functions take the form 

(Rn|r|Om) = i j dke*'^-^(ii„k|Vk|u,„k) , (4) 

the converse relation being 

Kk|Vk|M™k) = ■ 



R 



In equations like these the Vk is understood to act to the 
right, i.e., only on the ket. The consistency of these two 
equations is easily checked; the latter can be derived by 
noting that 

(Wnk|Wm,k+b) = (^/'nk|e"'''''"|'0m,k+b) 

= ^e-''"^(Rn|e-''^-'"|Om) , 

R 

and then equating first orders in b. Similarly, equating 
second orders in b leads to 

(Rn|r2|077i) = -^^ J dke''' '^{unu\Vi\u^u) ■ (6) 



As is well known, Wannier functions are not unique. 
For a single isolated band, the freedom in choice of the 
Wannier functions corresponds to the freedom in the 
choice of the phases of the Bloch orbitals as a function of 
wavevector k. Thus, given one set of Bloch orbitals and 
associated Wannier functions, another equally good set 
is obtained from 



|Unk) 



l^nk) 



(9) 



where 



is a real function of k. Such a transforma- 



tion presEpzES the Wannier center r„ modulo a lattice 
vector jB'oIIj but of course it does not preserve the spread 

For a composite set of bands, the corresponding free- 
dom is 



\Unk) 



/ J ^ ran 



Umk) 



(10) 



where Umn is a unitary matrix that mixes the bands at 
wavevector k. Eq. can be regarded as a special case 
of Eq. (|l^) that results when the U are chosen diagonal. 
The transformation ( |lO| ) does not preserve the individ- 
ual Wannier centers, but does preserve jthe sum of the 
Wannier centers, modulo a lattice vector .llJ We shall fre- 
quently refer to this freedom as a "gauge freedom" and 
the transformation ( p^ ) as a "gauge transformation." 

Our goal is to pick out, from among the many arbi- 
trary choices of Wannier functions, the particular set that 
is maximally localized according to some criterion. Our 
choice of criterion is introduced and justified in the fol- 
lowing subsection. Of course, some arbitrariness will re- 
main: (i) there will always be an atbitrary overall phase 
of each of the J Wannier functions ;E3 (ii) there is a free- 
dom to permute the J Wannier functions among them- 
selves; and (iii) there is a freedom to translate any one 
of the J Wannier functions by a lattice vector (that is, 
to decide which Wannier functions belong to the "home" 
unit cell labeled by R = 0). Aside from these trivial re- 
maining degrees of freedom, we expect to find a unique 
set of maximally localized Wannier functions. 
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III. SPREAD FUNCTIONAL IN REAL SPACE 

As a measure of the total delocalization or spread of 
the Wannier functions, we introduce the functional 



n 



(11) 



(recall r„ = (r)„). Eq. (|l l| ) is to be minimized with 

respect to the unitary transformations Uml- A func- 
tional of this -form has previously appeared as one p^ssi- 
ble definitions of the "localized molecular orbitals"t3L£l 
discussed in the chemistry literature. Other localization 
criteria, such as maximizing the sum of Coulomb self- 
energies of the orbitalsEior the the product of the sepa- 
rations of the centroidsEil have also been suggested. We 
focus on the Wannier function obtained by minimizing 
Eq. (^l]) for the following reasons, (i) The Wannier func- 
tions so determined correspond precisely to those con- 
sideierl .by previous authors for the isolated-band case in 
IDdHO and 3D. El (ii) In the ID multiband case, the opti- 
mally localized Wannier functions defined by minimizing 
Eq. ( |ll|) turn out to be identical to the|-eigenfunctions 
of the projected position operator PxP,Lill£3 as will be 
demonstrated shortly. (Here P is the projection opera- 
tor onto the group of bands under consideration. 



(12) 



Rn 



nk 



and Q = 1 — P is the projection operator onto all other 
bands.) (iii) It is one-pf the functionals proposed in the 
chemistry literature.E^ (iv) It leads to a particularly ele- 
gant formalism, allowing, for example, the decomposition 
into invariant, diagonal, and off-diagonal contributions as 
described below. 

We find it convenient to decompose the functional ( p^ ) 
into two terms, 



where 



E 



(r^)„ - ^ |(Rm|r|On) 



R,m 



and 



n 



^ ^ |(RTO|r|On) 



(13) 



(14) 



(15) 



Clearly the second term is positive definite. While it is 
not immediately obvious, the first term is also positive 
definite, and moreover it is gauge-invariant (i.e., inde- 
pendent of the choice of unitary transformations among 
the bands). To see this, we use the definitions of P and 
Q in terms of the Wannier functions to write 



Clj = 'y^^{On\raQra\On) 

net 

= trc [PraQrg] 

= \\PxQ\\l + \\PyQ\\l + \\PzQ\\l 



(16) 



Here trc indicates the trace per unit cell, and — 
trc[A'l'A]. The last form makes it obvious that Vli is 
positive definite. Operators of the. form PyQ have been 
discussed extensively by Nenciu;E3 unlike r itself, PvQ 
commutes with lattice translations, and its expectation 
value is well defined in any (normalizable) extended state. 
Thus, it follows that fij is gauge- invariant (i.e., invari- 
ant with respect to the choice of Wannier functions, or 
equivalently to the choice of the unitary mixing matrices 
Umn)- This will become even clearer in Sec. [V, where fij 



is expressed in a finite-difference k-space representation. 

It was stated earlier that in ID the set of Wannier 
functions that minimizes the spread functional, Eq. (|ll]), 
turns out to be identical to the set of eigenfunctions of 
the projected position operator PxP. This can now be 
seen as follows. Choose the Wannier functions |0m) to be 
eigenfunctions of PxP with associated eigenvalues xom- 
Then 



{Rn\x\{)rii) = {Rn\PxP\Oni) = xq™ 5ii,oS„ 



(17) 



Clearly vanishes, and since Qi is gauge-invariant, this 
minimizes Eq. ([T3|). Thus in ID the solution is essentially 
trivial, even in the multi-band case, and rimin = at 
the solution. 

From this point of view, it can now be understood that 
the essential difficulty in the 3D case is that the opera- 
tors PxP, PyP, and PzP do not commute (or, in the 
language of Appendix that matrices X, Y, and Z do 
not commute.) For if they did, one could choose the 
Wannier functions to be simultaneous eigenfunctions of 
all three, and one could again make Q vanish. But this 
is not generally the case, and the problem is to find a 
set of Wannier functions that makes the best possible 
compromise in the attempt to diagonalize all three si- 
multaneously. Indeed, it appears very natural that the 
criterion should be simply to reduce, as far as possible, 
the mean-square average of all off-diagonal matrix ele- 
ments of X, y, and z between Wannier functions; this is 
precisely the criterion encoded into f2. A procedure for 
carrying out this minimization directly in real space is 
sketched in Appendix |^. However, for crystalline solids 
with periodic boundary conditions, it is more straight- 
forward to work in k-space as discussed in the following 
section. ^ 

Finally, for later reference, it is useful to decompose 
into band-off-diagonal and band-diagonal pieces. 



OD 



(18) 



where 
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f^OD = ^ ^|(Rm|r|On) . (19) 

and 

= ^ ^ |(Rn|r|On) ^ . (20) 



IV. SPREAD FUNCTIONAL IN K-SPACE 

A. Transition to k-space 

We now derive expressions for fl, ili, fi, etc. in terms 
of a discretized k-space mesh. We begin by substituting 
expressions (0) and (^) into Eq. (11), and making use of 



V 



(27r)3 



dk 



-E 



(21) 



where iV is the number of real-space cells in the system, 
or equivalently, the number of k-points in the Brillouin 
zone. Using the finite-difference expressions for Vk and 
introduced in Appendix O, we have 



k,b 



and 



k.b 



- 2 Re (u„k|u«,k+b)] 



(22) 



(23) 



Here b are vectors connecting each k-point to its near 
neighbors and Wh are associated weights (see Appendix 

Clearly, these expressions reduce to Eqs. (^ and (||) 
in the limit of dense mesh spacing {N —> oo, b ^ 0). 
However, we should like to insist on a second desir- 
able property as well: namely, that for a given k-mesh, 
r„ and (r^)„ should transform as expected when the 
definition of \0n) is shifted by a lattice vector. (This 
corresponds to changing the choice of which Wannier 
functions belong to the "home" unit cell.) That is, 
when |M„k) \unk)e^'-^'^, so that ( 
(unk|wn,k+b)e~*^'-^, wc should find 



(24) 



~~> ~\- R , 



so that n will be unchanged. Expressions (|2^) and (|22 
do not obey these requirements, but can be modified to 
do so. As long as the modifications leave the summands 
unchanged to order b and b^ in Eqs. (^2|) and ( |2^ re- 
spectively, they will still reduce to Eqs. (0) and (||) in 
the continuum limit. 
Let 



and, for a given n, k, and b, let 

Mi^^'^^ = l + txb+^yb' + 0{b^) 



(25) 



(26) 



By expanding (u„_k+b|un.k+b} = 1 order by order in 6, 
it is easy to check that x and y are real. Then, referring 
to Eqs. (|2|) and (|3|), we have 



(27) 



M('^'b)-l = zx6-f 0(6^) , 

2 - 2 Re M^^'^) = -yb^ + ^(fe^) 
It is also easy to check that 

ixb = i Im In a4^;'') + 0{b^) , 



yb' = l- \Mi 



k,b)|2 , 27 2 



Oib^) 



Thus, in place of Eq. (22) we write 

^"^-^E^^blmlnMf"^'*') , 



(28) 



(29) 



(30) 



(31) 



k,b 



and, in place of Eq. (p3h. 



1 

N 



E 

k,b 



l-|Mf„^)p 



Im InMl^"^) 



(32) 



When inserted in Eq. (|TT|), this gives our operational def- 
inition of the spread functional f2. 

It is easy to check that Eqs. (^l[)-(^) obey conditions 
( p^ ) exactly, while still reducing to Eqs. (0) and (||) in 
the continuum limit. The expression for the Wannier cen- 
ter, Eq. (^), is strongly reminiscent of the Berry-phase 
expression of Refs. ^ and |l^, and reduces to it for an 
isolated band in ID. [It is also exactly invariant, modulo 
a lattice vector, under any change of phases of the form 
of Eq. (^) , provided that the phases still vary smoothly 
enough with k to prevent ambiguity in the choice of 
branch when evaluating InM,!^'*"^. Of course, it is not 
invariant under an arbitrary gauge transformation, Eq. 
&]■ 



Note that expression (32) for (r^)„ is not unique, even 



when insisting on the invariancc condition (|24|). For ex- 
ample, replacing 



l-|Mt-^)p 



-2RelnM(^''') 



(33) 



results in an equally valid finite-difference formula for Q. 
However, use of the form (^2|) facilitates a connection 
with the decomposition of Q ~ Qi + Qqd + f^D hito in- 
variant, off-diagonal, and diagonal components as in Eqs. 
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( p^ ) and (18). Following the lines of the formalism above, 
one finds that Eq. ( |l4|) becomes 



k,b \ mn / 



(34) 



k,b 



where p(k) ^ |u„k)(u«k|, Q^^'> = 1 - P^^\ and the 
band indices m, n run over 1,..., J. Similarly, Eqs. (|l|) 
and (Bn) become 



(35) 



k,b m^n 



and 



"D = ^E"'&E(-I"il"^^«"''^-b.r„)' . (36) 



k,b n 



n 



OD, 



From these expressions, it is again evident that f2i, 
and are all positive definite. 

Eq. (jsj) also now shows clearly that fli is gauge- 
invariant [i.e., independent of the choice of the Wan- 
nier functions, Eq. (p^j. Heuristically, fli represents 
the degree of dispersion of the band projection opera- 
tor P^^^ through the Brillouin zone. That is, fli is small 
insofar as P'^'^-' is nearly independent of k. (Note that 
tr[PiQ2] = ||Pi -P2IIV2 represents the "spillage," or de- 
gree of mismatch, between the spaces 1 and 2.) Since ili 
is invariant with respect to gauge transformations , it 
can be evaluated once and for all in the initial gaugeji.e., 
using the initial w„k) before performing the minimization 
procedure outlined below. 

■-lae 



It is amusing to note, following the ideas of Refs. 34 



that one can define a "quantum distance" between two 
wavevectors k and k' as dP = tr[P'-^^Q^^ )], thus induc- 
ing a metric upon the k-space. The invariant part of the 
spread functional, fli, turns out to be nothing other than 
the Brillouin-zone average of the trace of this metric. We 
discuss the properties of this metric, and speculate about 
its utility, in Appendix 



B. Gradient of spread functional 

We now consider the first-order change of the spread 
functional Q arising from an infinitesimal gauge transfor- 
mation, Eq. (10), given by 



* * mn. ' 



(37) 



where dW is an infinitesimal antiunitary matrix, dW^ 
—dW, so that 



|u„k) 



\Unk) 



mn Wmk) 



We seek an expression for dn/dW^nl. We use the con- 
vention 



fdF_^ 



dF 



(note the reversal of indices), so that 
dtT [dW B] 



dW 



= B 



dRetr [dW B] 
dW 



dimtr [dW B] 
dW 



A[B] 



= S[B] 



(39) 



(40) 



(41) 



(42) 



where A and S are the superoperators A[B] = {B—B^)/2 
and S[B] = (B + B^)/2i. As we shall see shortly, it is 
possible to cast dQ into the form of the numerators of 
Eqs. (|l|) and (||). 

For the present purpose it is convenient to write f2 — 
OD + f^D J where D,^ is the diagonal part given by Eq. 
(^), and the invariant and off-diagonal parts are com- 
bined into 



= ^E-^E[1-I^^n^''^l 
k,b n 

From Eq. (||) it follows that 



(43) 



(44) 



Using AfC^-'^) = [AfC'+b.-b)]! and dW = -dW'< , the 
second term in Eq. (H) can be transformed to become 
_[(iVF(k+b)^j(k-i-b,-hflj^^^^ Defining 



we thus find 



k,b 



Similarly, defining 

g^'')=ImlnMt''^)+b.r„ 



and 



5(k,b) _ 



^(k,b) 



(k,b) ' 



(45) 



(46) 



(47) 



(48) 



(38) Eq. ( pq ) gives for the diagonal part 
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k,b n 

^^^(k+b)j^(k+b,-b)]^^ . (49) 
the two terms can be 



Substituting qli^^^' 
combined, resulting in 



d^r, = -—J2'^bl-aitT[dW^^^T'-^'^'^] , 



k,b 



where 



^(k,b) ^ n(k,b) (k,b) 

inn mn 



(50) 



(51) 



We thus arrive at the desired expression for the gradient 
of the spread functional, 



^(k) ^ 



(52) 



We note, for completeness, that making the replacement 
( ^ ) has just the effect of replacing i? by i? in the first 
term above. 

The condition for having found a minimum is that the 
above expression should vanish. We discuss the numer- 
ical minimization of the spread functional by steep est 



descents, using this gradient expression, in Sec. IV D 



C. Special cases 

1. One dimension 



As mentioned in Sec. Ill, in ID it should be possible 



to choose the Wannier functions to be eigenfunctions of 
the band-projected position operator PxP, and thus to 
make 51 = JIod + ^^d vanish. Unfortunately, on a finite k- 
mesh D, cannot generally be made to vanish completely. 
At the minimum, does vanish, but iloD does not, 
leaving a remainder that is expected to approach zero as 
0(6^) with mesh spacing b. 

First, note that starting from any given gauge, it is 
straightforward to adjust the phases of the \u„k ) in or- 
der to make 57d = without affecting fioD whatsoever. 
For each n, let A„ — s„/|s„| where s„ = IljLo^ Mnn'^'''^ 
(thus A„ is the "Berry phase" of band n); then, start- 
ing from the first point j = 0, recursively set the phase 
of \un,kj+b) such that m'^u'^^^ — A^^, for successive k- 

points j. Then all the Mnn' will have the same phase 
and will vanish. This operation has no effect whatso- 
ever on the magnitudes of the elements of Mm^^^ , and 
so, by Eq. (^, it leaves fioD unchanged. This argu- 
ment demonstrates that JId — and thus J7 = r^oD at 
the minimum. 



A good starting guess that will make ^od rather small 
(and keep JId = 0) can be constructed as follows. We first 
establish a notion of "parallel transport" of the Bloch 
functions. Starting with some arbitrary choice (from 
among all possible J x J unitary rotations) of the \unko) 
at an initial k-point fcoj we choose the |wn,fco+6) ^-t the 
next point ko + b hy insisting that Mmn'^'''' should be 
hermitian. [This choice is uniquely given by the singu- 
lar value decomposition M = VHW"^ , where V and W 
are unitary and S is a diagonal matrix with nonnega- 
tive diagonal elements. Then M = {VT.V''){VW''); and 
by appropriate unitary rotation, the VW'' term can be 
eliminated, leaving M hermitian.] This procedure is re- 
peated, progressing from k-point to k-point [and using 
Un-7r/a{x) = u^^^^ / a{x) 6^'"^^ ^ whcu crossiug the Bril- 
louin zone boundary] until the loop is completed, estab- 
lishing a new set of states at fcp that are related to the 
initial ones by a unitary transformation A. |(This matrix 
A is the generalization of the Berry. rahas^ZI to a non- 
Abelian multi-dimensional manifold £3'l3c2I) Next, one 
diagonalizes A = VXV^, and rotates the bands at every 
k-point by the same unitary matrix V. Having done this, 
one finds that each state \unko) is carried onto itself by 
parallel transport around the loop, except that it returns 
with an excess phase A„. Finally, defining 7„ — Xn^^^ 
and modifying the phases as \unkj) ^ ji, [""nfej), we ar- 
rive at the desired solution (parallel-transport gauge). 

At this solution, each Mmii^^^ = Kml^n with K her- 

(k -\-b) 

mitian. It follows that the Im In Mnn are independent 
of k, the Wannier centers Xn are determined by the A„, 
and thus JId of Eq. ( p6| ) vanishes. From Eq. (35) it can 
be seen that iloD does not generally vanish. However, 

(k) 

OoD depends only on the matrices Kmn, and these can 
be shown to scale as Smn + 0(6^), so that \Mmn'''^\'^ is 
expected to scale as 0(6^), and fioD as 0(6^). 

If a minimization of Q is then carried out starting from 
this parallel-transport solution, one expects to remain 
zero and fioD to be reduced slightly, the reduction again 
being expected to be 0(6^). The Wannier centers will 
also presumably shift slightly. 

If one is mainly interested in the Wannier centers in 
the ID case, it may be preferable to take these from 
the parallel-transport solution (i.e., from the A„), rather 
than from the Xn at the minimum. The former ap- 
proach cop^spfjinds more closely with the Berry-phase 
viewpoint JlrEJlij and indeed the sum of the Wannier 
centers so defined correspo.nds .to the usual formula for 
the electronic polarization.EJO (Actually, for this pur- 
pose, the full parallel-transport construction need not be 
carried out. A may be calculated as the product of the 
unitary parts of the M matrices in any given representa- 
tion, and the A„ obtained as its eigenvalues. By "unitary 
part" we mean the VW'' taken from the singular value 
decomposition M = VTiW''' .) On the other hand, the 
parallel-transport formulation does not easily generalize 
to higher dimensions. Thus, the approach of minimizing 
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the Q functional appears to be the most natural one in 
higher dimensions, and it gives results that differ only 
very slightly from the parallel-transport solution for rea- 
sonable meshes in ID. 



2. Isolated band in multiple dimensions 

For the case of an isolated band in multiple dimensions, 
the problem of finding the optimally localized Wannier 
function maps onto the peabJem of solving the Laplace 
equation for a phase field,H'Ell as described next. floD is 
not present, and the problem reduces to minimizing Od, 
so that only the second term in Eq. (p2[) appears. Clearly 



R is identically one and T^^'^'' 
Eq. (p^) becomes 



^(k,b) jg leol^ so that 



GC') =4i^wb Im In M^^'^'' 

b 



(53) 



At the solution, this expression must vanish. Starting 
from some initial guess on the phases of the |itk) and 
making the substitution of Eq. (^, it can be seen that 
Eq. ( p3| ) corresponds to a solution of the Laplace equation 
for the phase field </>(!<;). This corresponds closely to the 
discussion in the vicinity of Eq. (5.15) of Ref. ^ 

The quantity — bim InM'^'''''^ is a finite- 

difference representation of the vector field A(k) = 
«(uk|Vk|uk); in the language of the theory of geometrical 
phases, A(k)_ifLknQwn as the "gauge potential" or "Berry 
connection. "E30'E3 The average value of A(k) is gauge- 
invariant -(modulo a quantum) and is set by the Berry 
phase J1J~Ej but A(k) is locally gauge-dependent. The 
minimization of Q via the solution of the Laplace equa- 
tion selects the gauge that makes V ■ A vanish, but its 
curl, B = V X A, is generally non-zero. In fact, B, which 
is known as the "Berry curvature," is a gauge-invariant 
quantity; M. |*an be regarded as an intrinsic property of 
the band.ya 

Since A(k) is periodic in fc-space, one can alternatively 
think in terms of the Fourier coefficients A(R). These 
can be divided into three contributions: the uniform part, 
A(R = 0); and, for R 7^ 0, the longitudinal and trans- 
verse parts ylL(R) and At(R), i.e., the components of 
A(R) parallel and perpendicular to R, respectively. The 
uniform part gives the Wannier center; the longitudinal 
part is the part that can be made to vanish by appro- 
priate choice of gauge; and the transverse part is gauge- 
invariant (it is related to the Berry curvature) and deter- 
mines the minimum value of Qd. In fact, the individual 
Fourier components A(R) can be related to the matrix 
elements (R|r|0) of Eq. (|l5|); it thus follows that at the 
solution, the latter are purely transverse, A(R) • R = 0. 
Unfortunately, the picture does not appear to remain so 
simple in the multiband discussed in Appendix 

n 

The Berry curvature, or equivalently, the transverse 
part of the Berry connection, can easily be shown to van- 



ish for an isola ted ban d in a crystal with inversion sym- 
metry (see Sec. [ V C 3| ) ; in this case the solution for A(k) 
is a perfectly uniform one, and Qu vanishes at the solu- 
tion. In a non-centrosymmctric crystal, however, this is 
not the case, since a non-zero Berry curvature is generally 
present. This provides a complementary viewpoint, for 
the single-band case, on the fact that the non-invariant 
part D, of the spread functional cannot generally be made 
to vanish. 



3. Inversion symmetry 

When inversion symmetry V{r) = V{—r) is present, 
the cell-periodic Bloch functions can be chosen to be 
real in the reciprocal representation; that is, ■(i„k(r) — 
X^G '^nk(G) exp(iG • r) with M„k(G) real. It might 

naively appear that all the Mmn''^ matrices could then 
be chosen real, and that the solution of the minimiza- 
tion problem might be trivial in some sense. This is not 
quite true. Even for an isolated band, there is the com- 
plication that the Berry phase of the band may be — 1 
instead of -1-1; in this case the M„k(G) can be chosen real 
locally (i.e., in a small neighborhood around any given 
k), but not globally. But this really only means that 
the corresponding Wannier function has definite symme- 
try under inversion through a symmetry center ( "Wyck- 
off position" ) other than the one at the origin, and the 
Berry phase can be reset to +1 by a shift of origin. For 
the case of composite bands, however, the problem is to 
choose a particular gauge transformation [Eq. (p^j , not 
just a phase transformation [Eq. (||)], and for this the 
presence of inversion symmetry does not provide any ob- 
vious solution. 

For example, consider the case of the four valence 
bands of Si. (Numerical results for this case appear in 
Sec. VIA.) Taking the origin at the center of the bond 



oriented along [111], it turns out to be possible to choose 
one of the Wannier functions to have inversion symme- 
try about the origin, while the other three have inversion 
symmetry about other Wyckoff positions (those corre- 
sponding to the other three bond centers), and the re- 
maining Wyckoff positions (tetrahedcal and octahedral 
interstitial positions) are unoccupiedfl This would have 
been hard to guess based on symmetry alone (although 
it is natural from a chemical point of view). Because 
each Wannier function does have its own inversion sym- 
metry, it turns out that ilu does vanish for Si. However, 
^OD 7^ 0. The contribution to Qqd from a given pair 
{mn} of Wannier functions is related to the matrix ele- 
ments (Rm|r|On). These matrix elements can be shown 
to vanish if, in addition to obeying inversion symmetry 
individually, the two Wannier functions are translational 
images of one another; but this is certainly not gener- 
ally the case. (In the language of Appendix |c[ the fact 
that OoD 7^ for Si is related to the fact that the Berry 
curvature tensor does not vanish for this system.) 
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Finally, in some cases it might be possible to choose all 
the Wannier functions to have definite symmetry under 
inversion, but the solution that minimizes H may spon- 
taneously break the inversion symmetry. Some cases of 



this sort are discussed in Sees. VIC and VI D below. 



4- Molecular supercells and single k-point sampling 



In the context of plane-wave pseudopotential and 
related approaches, it is common to study molecules 
or clusters j-ia an artificial periodic superlattice 
arrangement. e3 In such a case, a single k-point (usually 
ko — r) sampling of the Brillouin zone suffices for con- 
ventional quantities such as energies, forces, and charge 
densities, since the errors in these quantities will be expo- 
nentially small as long as the overlap between wavefunc- 
tions in neighboring supercells is negligible. However, 
under the same conditions, the calculation of using our 
approach introduces small errors that nevertheless scale 
only as L~^, where L is the supercell dimension (see, 
e.g.. Sec. VIC). The problem essentially arises from the 
use of the simplest finite-difference representation of Vk, 
involving only nearest-neighbor k-points (see Appendix 
^). If higher accuracy is need, this problem can be over- 
come in either of two ways: (i) by using the solution at 
kg to construct solutions on a denser mesh of k-points, 
Mk(r) — Uko(r) exp[i(ko — k) • r], being sure to take the 
discontinuity of (kg — k) • r near the supercell boundary 
where Wko(r) is negligible; or (ii), construct periodic func- 
tions a;(r), y(r), ^(r) such that x — x,y — y,z = zm 
the molecular region, with (possibly smoothed) disconti- 
nuities at the supercell boundaries, and then apply the 
theory of Appendix]^ to the matrices X, Y, Z computed 
as Xmn = (wmko |5;|u„ko}i ^tc. Approach (i) is a "quick 
fix" requiring very little reprogramming, while approach 
(ii) is preferable in principle. 

It is also common practice to use single k-point sam- 
pling for supercell calculations on extended systems, pro- 
vided that the supercell is sufficiently large in all three 
dimensions. In such cases, our procedure can again be ap- 
plied, but it should be kept in mind that the convergence 
of n with supercell size should be expected to be slower 
than the convergence of total energies and forces. More- 
over, the electronic polarization that would be computed 
from the sum of our Wannier centers is not guaranteed to 
be exactly identical to the one-that would be computed 
from the Berry-phase formula£Sc3 



-2e 

'V 
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hnko) ■ (54) 



used in recent molecular-dynamics simulations of infrared 
absorption spectraJlj However, the two should be very 
close, and should become identical in the limit of large 
supercell size. 



D. Steepest-descent minimization 

1. Algorithm 

In order to minimize the spread functional Q by steep- 
est descents, we make small updates to the unitary ma- 
trices, as in Eq. (|37|), choosing 



(55) 



where e is a positive infinitesimal. We then have, to first 
order in e. 



k 



(k)||2 



(56) 



where = Smn l^mnP we have made use of 

G'' — — G. Thus, use of Eq. ( |5^ ) is guaranteed to make 
dn < 0, i.e., to reduce f2. 

In practice, we take a fixed finite step with e = a /Aw, 
where w = J^b'^b, so that 



AM/(k) ^^Ynj, fyl[i?('^-'')] -5[T('^'*')] 



(57) 



The wavefunctions are then updated according to the 
matrix exp[AW^'^'')], which is unitary because ISW is an- 
tihermitian. The choice of prefactor above is designed 
so that in the single-band case, and for simple k-meshes 
(e.g., simple cubic), the "highest-frequency mode" asso- 
ciated with phase rotations is just marginally stable with 
the choice a = 1. That is, if one starts with the true so- 
lution and rotates the phases of the wavefunctions on all 
k-points simultaneously by an angle ±7, with the oppo- 
site sense of rotation on nearest-neighbor k-points, then 
from Eq. ( ^7| ) Aq^^'^^ — ±27 on every link, and the above 
choice of AW exactly returns the system to the solution 
if a = 1/2, and is marginally unstable at a = 1. We 
find that a = 1 is still a safe choice for all the systems 
studied; more efficient strategies, if needed, can also be 
implemented straightforwardly, using, e.g., a conjugate- 
gradient approach in composing subsequent descent di- 
rections, and by choosing at each step the optimal a for 
the line minimization. 

It should be noted that the evolution towards the min- 
imum requires only the relatively inexpensive updating 
of the unitary matrices, and not of the wavefunctions, as 
follows. We choose a reference set of Bloch orbitals 1^1^"^) 
and compute once and for all the inner-product matrices 



7V/(0)(k,b) 



\"mkl"n,k+b/ 



(58) 



We then represent the |w„k) (and thus, indirectly, the 
Wannier functions) in terms of the |u^k) ^^'^ ^ 
unitary matrices Uml, 



9 



\Unu)=Y.U^n\u^l) ■ (59) 

m 

We begin with all the Umn initialized to (5,„„. Then, each 
step of the steepest-descent procedure involves calculat- 
ing AW from Eq. (|57|), updating the unitary matrices 
according to 

[/(k) ^ [/(k) exp[AM^(K)] , (60) 

and then computing a new set of M matrices according 
to 

^(k,b) ^ ^(k)t^(0)(k.b)[;(k+b) _ (-g-^^ 

The cycle is then repeated until convergence is obtained. 
Note that the exponential in Eq. ( |60| ) is a matrix oper- 
ation, which we perform by transforming to a diagonal 
representation of AW and back again. 

Typically, we prepare a set of reference Bloch orbitals 
'"ik) by projecting from a set of initial trial orbitals 
g,i{r) corresponding to some rough initial guess at the 
Wannier functions. For example, for these gn(r) we have 
used Gaussian functions centered at or near mid-bond 
positions. The initialization procedure involves first pro- 
jecting onto Bloch states of the set of bands at wavevector 
k, 

|0rik) = X! l^'"k)(V'mk|5n) • (62) 
m 

Since these are not orthonormal, we then perform a sym- 
metric orthonormalization to form a set of 

|0«k) = 51(^"'^')™"l'/'™k) (63) 

m 

(where S',„„ — (0mk|</'nk)), and finally convert to cell- 
periodic functions via 

«ikW=e-'^-""?nk(r) . (64) 

(In practice, the above steps are combined.) This pro- 
cedure is similar in prineiolc to the one introduced by 
Satpathy and Pawlowska,EJ although it differs in that we 
do the orthonormalization in k-space. We then use this 
set of reference Bloch orbitals as a starting point for the 
steepest-descent procedure. In practice, we find that this 
starting guess is usually quite good, as will be shown for 
the cases of Si and GaAs in Sec. y</n. 



2. False local minima 



random phase rotation to each u^^ individually, or a ran- 
dom J X J unitary rotation to the set of u^^^ at each 
k. With such starting guesses, we find that while the 
steepest-descent procedure sometimes does lead to the 
desired global minimum, it often can get stuck in local 
minima. That is, we find that the spread functional fJ, 

fk) 

viewed as a function of the set of UmAj does typically 
have false local minima that must be avoided. 

We find that this problem is not associated with the 
presence of a large number of bands, and in particular it 
never arose in the case of F-only sampling in orthorhom- 
bic supercells (say, even for a disordered 64-atom cell 
of Si). Instead, it tends to be associated with finer k- 
point meshes, regardless of whether one is treating sin- 
gle or multiple bands. In the case of multiple bands, 
some Wannier functions at the local minimum have an 
almost normal behavior, with reasonable spread, while 
other functions are abnormal, with spreads an order of 
magnitude larger than expected. Moreover, we found all 
the false minima to be characterized by Wannier func- 
tions that are genuinely complex (for the true global min- 
imum we always found the Wannier functions to be real, 
apart from a trivial overall phase). The Wannier func- 
tions associated with false local minima usually display 
erratic and unphysical oscillations. 

The problem appears to lie in the possibility of mak- 
ing inconsistent choices in the branch cuts when evalu- 
ating the logarithms of complex argument in (p7|). In a 
naive implementation, the branch cuts are simply cho- 
sen so that Igi*''''''! < TT. At a good global minimum, 
all of the <^ it, while at a false local minimum 

some of the approach tt. We find that the system 

can be excited out of these local minima by an itera- 
tive process of switching the branch cut by 27r for the 
few qn 's with the largest magnitudes, followed by further 
steepest-descent minimization, and that such a process 
almost always leads fairly quickly into the basin of at- 
traction of the global minimum. 

Moreover, we have never observed the system to be- 
come trapped in a false local minimum when starting 
from reasonable trial projection functions, Eqs. (|6^|64|). 
If physically motivated trial functions are not available, 
we find that another effective heuristic approach is to 
use as trial projection functions Gaussians that are either 
centered at random locations, or on arbitrary meshes in 
the unit cell. 

In summary, while false local minima can occur in our 
minimization scheme, they do not seem to pose a problem 
in practice. 



We have also carried out tests in which we initialize 
the steepest-descent procedure with more arbitrary start- 
ing guesses. For example, we have let the starting 
consist of energy-ordered Hamiltonian eigenstates with 
quasi-random phases, as in the typical output of a band- 
structure code. We have also tried applying a completely 



V. PROPERTIES OF OPTIMALLY-LOCALIZED 
WANNIER FUNCTIONS 
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A. Asymptotic localization properties 

Following from the early work of Kohn,i it is generally 
expected that Wannier functions can be chosen to have 
exponential localization. While it is not the purpose of 
the present work to study questions of exponential decay 
in the tails of the Wannier functions, we nevertheless give 
a brief discussion of these issues here. 

KohnH proved the existence of exponentially localized 
Wannier functions for the case of an isolated band in 
ID, for a crystal with inversion symmetry. However, the 
method does not easily generalize. Blount demonstrated 
the analytieity of the Bloch functions for the single-band 
case in 3DJ3 and claimed (end of Sec. 5 of Ref. ^ that 
this would imply the exponential localization of the Wan- 
nier functions (see also Ref. ^l|); but this claim was later 
shown to be faulty by Nenciu (footnote on first page of 
Ref. ^), who pointed out the global topological aspects 
of the problem. Des Cloizeaux proved the exponential lo- 
calization of the band projection operator Pf^ Eq. (|l^ ) 
for an arbitrary set of composite bands in 3D.c3 Unfortu- 
nately, this does not immediately imply that the Wannier 
functions are exponentially localized (although the con- 
verse would follow). In a following paper, des Cloizeaux 
was able to prove the possibility of choosing exponentially 
localized Wannier functions for an isolated band (i)-4n ID 
generally, or (ii) in the centrosymmetric 3D case.Ell The 
summary (Sec. V) of Ref. 47 gives a good discussion of the 
difficulties and partial progress towards a solution of the 
general composite-band problem. More recently, Nenciu 
completed a proof for tho-Gase of an isolated band in 3D 
without centrosymmetry.E3 To our knowledge, however, 
the problem remains unsolved for the general case of com- 
posite bands in 3D. Finally, note that some discussion of 
the exponential localization of the "generalized Wannier 
functions" defined for the cases of surfaces and defects 
has been given in Refs. p6| , ^8| -[5^. 

It is natural to speculate that the "optimally localized" 
Wannier functions that are obtained by minimizing the 
spread functional of Eq. (^l]) are exponentially localized. 
Actually, one should distinguish between a "weak con- 
jecture" that the optimally localized Wannier functions 
have exponential decay, and a "strong conjecture" that 
they have the same exponential decay as that of the band 
projection operator P. At the present time, we can only 
speculate that in 3D, the weak conjecture, at least, will 
hold. 

In ID, we are on firmer footing. As shown in Sees. 
Ill and [V C_l , the functions that are obtained by mini- 
mizing Eq. (|ll[) correspond, in principle, with those con- 
sidered by previous authors, and fpjj-. whic h exponential 
localization has been demonstrated.Hu'EllO In particular. 



we have shown in Sec. Ill that these will be eigenfunc- 
tions of the band-projected position operator PxP; Niu 
has given a simple and elegant argument, based on this 
fact alone, from which one may conclude thai the Wan- 
nier functions decay faster than any power. l3 From this 



point of view, the essential difficulty in 3D is that the 
Wannier functions can no longer generally be chosen to 
be eigenfunctions of all three band-projected position op- 
erators simultaneously. 

Returning to the general 3D case, we find that it is 
not easy to carry out numerical tests of exponential lo- 
calization using the present method, which is based on 
discretization in k space. The Wannier functions that we 
obtain are thus not truly localized, being instead artifi- 
cially periodic with a periodicity inversely proportional 
to the mesh spacing. 



B. Conjecture: optimally localized Wannier 
functions are real 

It seems not to be widely appreciated that the Wannier 
functions Wn{r) can always be chosen real. This depends 
only on the Hamiltonian H = /2m -\- V{y) being her- 
mitian, and not on any symmetry of the (real) potential 
V{y). Indeed, from Eq. (|l|) it is clear that one only needs 
to choose 



(r) 



.-k(r) 



(65) 



to insure that the Wannier functions Wn{v) are real. This 
condition is automatically satisfied if one starts with ini- 
tial Wannier functions projected from real trial functions, 
as discussed at the end of the previous subsection; alter- 
natively, it can be imposed by hand. From Eq. ([2^), 

condition ( |65| ) implies that Mmn''^ = Mmi^'' '^■'*, which 
in turn implies Gmn = Gmn^* , so that Eq. ( |65| ) con- 
tinues to be satisfied during the steepest-descent update 
procedure. In this way, one will eventually arrive at a 
set of maximally localized real Wannier functions. (Sim- 
ilarly, working in real space, it is easy to see from Ap- 
pendix |A| that a real initial guess will result in a set of 
real optimally-localized solutions.) 

We conjecture that a stronger result is true: namely, 
that the optimally localized Wannier functions are always 
real (apart from a trivial overall phase of each Wannier 
function). That is, we conjecture that the minimum that 
is arrived at subject to the constraint (^5|) is actually the 
global minimum. 

We have not found a proof of this conjecture, but it is 
supported by our empirical experience. More precisely, in 
the tests to be reported in Sec. VI, we find that whenever 
we arrive at the global minimum, the Wannier functions 
always turn out to be real, apart from a trivial overall 
phase. (However, we do find that the Wannier functions 
are typically complex at false local minima, as discussed 
in Sec. |IVD2| .) 



VI. RESULTS 
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A. Si 

For Si, the four occupied valence bands have to be 
taken together as a single composite group, because of 
degeneracies between the bottom two bands at X, and 
between the top three bands at F. Thus, we take J — 4 
and look for a set of four Wannier functions per prim- 
itive unit cell. These are expected to be centered on 
the bond centers, and to have roughly the character 
of CT-bond orbitals, i.e., even linear combinations of the 
two sp^ hybrids projecting toward the bond center from 
the two neighboring atoms .□ Wannier functions of this 
type havp iibceB. computed previously by a variety of 
methods. oEil l3 It is tempting to imagine that the re- 
quirement of spanning the given set of valence bands, 
together with the symmetry requirement that each Wan- 
nier function has the expected inversion, mirror, and 
three-fold rotational symmetries about its corresponding 
bond center, might be enough to uniquely determine the 
Wannier functions. We emphasize that this is not the 
case, and we proceed to determine the particular set of 
Wannier functions that minimize the spread functional 

n. 

Our calculations 
are carried out within the local-densitjz-approximation to 
Kohn-Sham density-functional theory,E3 using a standard 
plane-wave pseudopotential apptoach and an all-bands 
conjugate-gradient minimization.ta We have used norm- 
conserving pseudopotential£3 in the Kleinman-Bylander 
representation, with plane-wave cutoffs ranging from 200 
eV to 650 eV, depending on the systems studied. The 
sampling of the Brillouin-zone. is performed with equis- 
paced Monkhorst-Pack gridsE3 that have been offset in 
order to include F. Since the crystal is fee in real space, 
the grid is bee in k-space, and we use the simplest possi- 
ble finite-difference representation of Vk using only the 
Z=8 nearest neighbors of each k-point (see Appendix^). 
The computed Bloch functions are stored to disk, and the 
construction of the Wannier functions is carried out as a 
separate, post-processing operation. 

Table || shows the convergence of the spread functional 
and its various contributions as a function of the density 
of the k-point mesh used. We confirm that iln does van- 
ish (to machine precision) as expected from the pr esence 
of inversion symmetry, as discussed in Sec. [VC3 . Since 
fii is invariant, the minimization of f2 reduces to the min- 
imization of r^oD- For each k-point set, the minimization 
was initialized by starting with trial Gaussians of width 
(standard deviation) 1 A located at the bond centers. We 
find that for the case of crystalline Si, these provide an 
excellent starting guess; for the 8x8x8 case, for ex- 
ample, we find an initial r2D=0 and rioD=0.565, whereas 
at the minimum fioD is 0.520. Had we started with the 
random phases provided by the ab-initio code, we would 
have obtained an initial ilD=622.1 and iloD=42.3. We 
find that typically 20 iterations are needed to converge to 
the minimum with good accuracy, starting with the ini- 



TABLE I. Minimized localization functional Q in Si, and 
its decomposition into invariant, off-diagonal, and diagonal 
parts, for different k-point meshes (see text). Units are A^. 



k set 


n 




f2oD 




1x1x1 


2.024 


1.999 


0.025 





2x2x2 


4.108 


3.707 


0.401 





4x4x4 


6.447 


5.870 


0.577 





6x6x6 


7.611 


7.048 


0.563 





8x8x8 


8.192 


7.671 


0.520 






tial choice of phases given by the Gaussians, and using 
a simple fixed-step steepest-descent procedure. Starting 
with a set of randomized phases requires roughly one or- 
der of magnitude more iterations, and adds the possibil- 
ity that the system may get trapped for a while in some 
local minimum. As previously pointed out, the evolu- 
tion does not require additional scalar products between 
Bloch orbitals, and so it is in any case pretty fast. Be- 
cause of symmetry, the Wannier centers do not move dur- 
ing the minimization procedure, and the spreads of the 
four Wannier functions remain identical with each other. 

What is perhaps most striking about Tabic | is that 
ill ^ ^oW, and while Q converges fairly slowly with k- 
point density, this poor convergence is almost entirely 
due to the Qi contribution. Incidentally, since the f2i 
contribution is gauge- invariant, it can be calculated once 
and for all at the starting configuration, for any given k- 
point set; the quantities that are actually minimized are 
r^D and f2oD- The former vanishes at the minimum, and 
the latter is found to converge quite rapidly with k-point 
sampling. It would be interesting to explore whether use 
of a higher-order finite-difference representation of Vk 
might improve this convergence, especially that of Qi, 
but we have not investigated this possibility. 

In Fig. |l], we present plots showing one of these 
maximally-localized Wannier functions in Si, for the 
8x8x8 k-point sampling. The other three are iden- 
tical (related to the first by the tetrahedral symmetry 
operations) and are located on the other three tetrahe- 
dral bonds. Each displays inversion symmetry about its 
own bond center, and it is real apart from an overall com- 
plex phase. Again, all these properties are not trivial, and 
would not be satisfied by a generic choice of phases. (Our 
initial guess based on Gaussians centered in the middle 
of the bonds does insure all these properties, but without 
optimizing the localization.) 

From an inspection of the contour plot it becomes 
readily apparent that the Wannier functions are essen- 
tially confined to the first unit cell, with very small (and 
decreasing) components in further- neighbor shells. The 
general shape corresponds to a chemically intuitive view 
of sp^ hybrids overlapping along the Si-Si bond to form 
a a bond-orbital, with the smaller lobes of negative am- 
plitude clearly visible in the back-bond regions. These 
results clearly illustrate how the Wannier functions can 
provide useful intuitive understanding about the forma- 
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(a) 




[111] Axis 



(b) 




FIG. 1. Maximally-localized Wannier function in Si, for the 
8x8x8 k-point sampling, (a) Profile along the Si-Si bond, 
(b) Contour plot in the (110) plane of the bond chains. The 
other Wannier functions lie on the other three tetrahedral 
bonds and are related by tetrahedral symmetries to the one 
shown. 



TABLE II. Minimized localization functional 57 in GaAs, 
and its decomposition into invariant, off-diagonal, and diag- 
onal parts, for different k-point meshes, together with the 
relative position (5 of the centers along the Ga-As bond (see 
text). Units for the f2's are A^. 



k set 


Q. 








/3 


1x1x1 


2.217 


2.088 


0.125 


0.0035 


0.593 


2x2x2 


4.409 


3.898 


0.503 


0.0078 


0.602 


4x4x4 


6.785 


6.170 


0.610 


0.0055 


0.613 


6x6x6 


7.982 


7.386 


0.590 


0.0058 


0.616 


8x8x8 


8.599 


8.038 


0.555 


0.0059 


0.617 


12 x 12 X 12 


9.146 


8.635 


0.504 


0.0061 


0.617 



tion of chemical bonds. 



B. GaAs 

In GaAs the lower valence band is never degenerate 
with the other (top) three valence bands, and thus sev- 
eral possibilities arise: (a) We can treat the four bands 
as a group, as was done for silicon, obtaining solutions 
that are very similar to the Si case, except for the loss 
of inversion symmetry about the bond centers, (b) We 
can deal separately with the bottom band and the top 
three bands; the latter would be considered as a group, 
while the former is a single isolated band. The solution 
at the minimum should resemble atomic orbitals for the 
more electronegative species (the As anion), in the form 
of three p orbitals and one s orbital respectively, (c) Fi- 
nally, it might be interesting to consider the case in which 
the four bands are treated together, but using the solu- 
tion of the Q. minimization for the 1-band and 3-band 
cases, without proceeding further with the minimization. 
This does not correspond to a true minimum for the 4- 
band 17 surface, but just to a stationary (saddle) point. 

Starting with the case in which all the four bands are 
treated as a group, we show in Table |l| the convergence 
of the spread functional and its various contributions as 
a function of the density of the k-point sampling. In 
analogy with the case of Si, the procedure is initialized 
using trial Gaussians of width 1 A, centered in the mid- 
dle of the bonds; this is again a very good starting guess, 
and (for the 8x8x8 mesh), gives an initial J7d=0.1164 
and rioD=0.593, that are reduced to 0.0059 and 0.555 
respectively by the minimization procedure. As it was 
the case for Si, k-point convergence is fairly slow, even 
though most of it is due to the slow convergence of the 
invariant part. On the other hand, the general shape of 
the Wannier functions at the minimum is already given 
rather accurately with coarser samplings (although the 
tails are then not so easy to characterize, since in prac- 
tice the Wannier functions are periodically repeated in a 
supercell conjugate to the k-point mesh). In particular, 
the k-point convergence of the Wannier centers is quite 
rapid, as is evident from the last column of Table ||, 



13 



TABLE III. Localization functional fl and its decompo- 
sition in invariant, off-diagonal, and diagonal parts, for the 
case of GaAs (units are A^). The bottom valence band, the 
top three valence bands, and all four bands are separately in- 
cluded in the minimization. The star (*) refers to the case 
in which the minimization is not actually performed, and the 
solution for the 1-band and 3-band cases is used. Sampling is 
performed with a 8 x 8 x 8 mesh of k-points. 



k set 


n 


Qi 






1 band 


1.968 


1.944 





0.0238 


3 bands 


10.428 


9.844 


0.560 


0.0245 


4 bands* 


12.396 


8.038 


4.309 


0.0483 


4 bands 


8.599 


8.038 


0.555 


0.0059 



where we show the relative position of the centers along 
the Ga-As bonds. Here (3 is the distance between the Ga 
atom and the Wannier center, given as a fraction of the 
bond length (in Si the centers were fixed by symmetry to 
be in the middle of the bond, (3 = 0.5, irrespective of the 
sampling) . 

In Fig. H, we present plots showing one of these 
maximally- localized Wannier functions in GaAs, for the 
8x8x8 k-point sampling. Again, at the minimum 17, 
all four Wannier functions have become identical (under 
the symmetry operations of the tetrahedral group) , and 
they are real, except for an overall complex phase. The 
shape of the Wannier functions is again that of sp"^ hy- 
brids combining to form cr-bond orbitals; inversion sym- 
metry is now lost, but the overall shape is otherwise 
closely similar to what was found in Si. The Wannier 
centers are still found along the bonds, but they have 
moved towards the As, at a position that is 0.617 times 
the Ga-As bond distance. It should be noted that these 
Wannier functions are also very similar to the localized, 
orbitals that are found in linear-scaling approaches,Ej 
where orthonormality, although not imposed, becomes 
exactly enforced in the limit of an increasingly large lo- 
calization region. This example highlights the connec- 
tions between the two approaches. The characterization 
of the maximally-localized Wannier functions indicates 
the typical localization of the orbitals that can be ex- 
pected in the linear-scaling approach. Moreover, such in- 
formation ought to be extremely valuable in constructing 
an intelligent initial guess at the solution of the electronic 
structure problem in the case of complex or disordered 
systems. 

As pointed out before, in GaAs we can have different 
choices for the Hilbert spaces that can be considered, so 
we also studied the case in which only the bottom band, 
or the top three, are used as an input for the the min- 
imization procedure. Table HI shows the spread func- 



tional and its various contributions for these different 
choices, where the bottom band is first treated as iso- 
lated; next the three p bands are treated as a separate 
group; then these two solutions are used to construct a 
four-band solution, without further minimization; and fi- 



(a) 




[111] Axis 



(b) 




FIG. 2. Maximally-localized Wannier function in GaAs, for 
the 8x8x8 k-point sampling, (a) Profile along the Ga-As 
bond, (b) Contour plot in the (110) plane of the bond chains. 
The other Wannier functions lie on the other three tetrahedral 
bonds and are related by tetrahedral symmetries to the one 
shown. 
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FIG. 3. Contour plot, in (110) plane, of the maxi- 
mally-localized Wannier function in GaAs for the 8x8x8 
k-point sampling when only the bottom valence band is con- 
sidered. 



nally, this is compared with the full four-band minimiza- 
tion. In composing the results for the 1-band and 3-band 
cases, we take the 1x1 and 3x3 unitary matrices that 
would give the minimum solution for the 1- and 3-band 
cases, and build from them a set of 4 x 4 block-diagonal 
unitary matrices. The 4-band that is obtained is ex- 
actly the sum of the two initial il's. Nevertheless, the 
bookkeeping changes: is reduced, with an equal and 
opposite contribution reappearing in flou- (The fljn's 
sum up exactly, as they must.) If we then minimize this 
(saddle-point) solution, we recover the 4-band minimum: 
the invariant part (obviously) does not change, while f^D 
increases to permit a larger reduction in f2oD, in corre- 
spondence to an increased interband mixing. 

In Fig. ^, we show the contour plot for the maximally- 
localized one-band Wannier function in GaAs, for the 
8x8x8 k-point sampling. The function is again real, 
and it shows the typical characteristics of an s orbital 
centered around the anion; the tetrahedral symmetry of 
the lattice deforms the spherical orbital, introducing con- 
tributions that point along the two bond-chains (one in 
the (110) plane plotted, and one perpendicular to that 
plane). In the three-band case, on the other hand, the 
Wannier functions resemble three orthogonal atomic p or- 
bitals. It should be stressed that only when all the four 
bands are treated simultaneously do we achieve the over- 
all maximum localization. This reinforces the picture in 
which the maximally localized orbitals correspond to the 
most natural "chemical bonds" in the system. 



C. Molecular C2H4 

We have also studied the case of the ethylene molecule 
(C2II4), in order to make the connection with some stan- 
dard chemistry concepts, and to highlight the relation of 
our formalism (derived from a k-space representation of 
extended Bloch orbitals ) to th e case of an isolated sys- 
tem as discussed in Sec. |IVC4 . First of all, the molecule 
is modeled in periodic boundary conditions, in a super- 
cell that is large enough to make the interaction with 
the periodic images negligible. Consequently, the band 
dispersion becomes also negligible, and F sampling is 
all that is needed for total energies, forces, and densi- 
ties. However, the spread functional is expected to con- 
verge sl ightly s lower with k-point sampling, as discussed 



in Sec. IV C 4. We thus tested several k-point meshes. 



For the single k-point case, the mesh in reciprocal space 
is that formed by the F point and all its periodical im- 
ages, i.e., the reciprocal lattice vectors; our formalism 
remains equally applicable to such a case. One should 
bear in mind that if the supercell is not cubic, appropri- 
ate weight factors have to be added in the calculation of 
the derivatives (see Appendix |^) . 

We show in Table |^ the coordinates for the C and 
H atoms at the structural minimum, together with the 
Wannier centers. We have used the local-densitv_approx- 
imation for the exchange-correlation functional.Ej In this 
molecule, there are six occupied valence eigenstates, the 
lowest five being of C-H or C-C cr-bonding character, 
and the top (frontier) orbital being of C-C 7r-bonding 
character. If we treat the lowest five bonds as a compos- 
ite group, we find as expected that the minimization of 
51 leads to cr-bond-orbitals located on each of the C-H or 
C-C bonds. However, treating all six bands together, we 
find that the C-C 7r-bonding orbital mixes strongly with 
the C-C cr-bonding orbital to give two Wannier func- 
tions that are symmetrically disposed above and below 
the x — y plane. Contour plots for the resulting C-H and 
C=C Wannier functions are shown in Fig. H, and the lo- 
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TABLE V. The functional fl and its decomposition, with 



increasing 


k-point S3,iiipliiii 


5, for 6th.ylGii6 


(units are 


j. 


k set 


n 








1x1x1 


4.041 


3.657 


0.384 





2x2x2 


4.503 


4.124 


0.380 


6x 10"^ 


3x3x3 


4.600 


4.222 


0.377 


3x 10~^ 



cations of the Wannier centers are reported in Table IV . 
The picture that emerges from this "natural" symmetry 
breaking of the planar geometry is just the Lewis picture 
of the C=C double bond. 

In our calculations we have used a cubic supercell of 
side 7 A] this gives to each band a dispersion that is al- 
ways smaller than 0.02 eV, and that originates from the 
interaction with the superperiodic images. Increasing the 
k-point sampling has negligible effects on the equilibrium 
positions of the C and H atoms and on the location of 
the Wannier centers. But it does still affect the localiza- 
tion functional, which displays a slower convergence with 
respect to the number of k-points used (although much 
faster than was the case for Si or GaAs). The results 
are summarized in Table where we show the fl con- 
tributions for the maximally-localized Wannier functions 
with increasing k-point sampling. It is readily seen that 
the slow convergence is coming mostly from the invari- 
ant part of the functional; a finer k-point mesh provides 
both a more detailed sampling of the Brillouin Zone and 
a more accurate calculation of the gradients. 



(a) 



(b) 



D. LiCl 

It is also interesting to look at a more ionic system, to 
understand the effect of electronegativity and band gap 
on the location and localization of the Wannier functions. 
We have studied rocksalt LiCl, treating all four valence 
bands (roughly CI 3s and 3p) as a unit, and again using 
an 8 X 8 X 8 k-point sampling. 

One could expect the Wannier functions to localize 
much more strongly around the anion than was the case 
for GaAs, and indeed this is what we find. However, we 
also find that the Wannier functions can reduce ^l further 
by mixing to form sp^ hybrids, sitting on the vertices of 
a tetrahedron centered around the CI atom, with each 
center at a distance of 0.449 A from the CI (the Li-Cl 
distance being 2.57 A). We anticipated that these hy- 
brids might prefer to align along the {111, 111, 111, 111} 
or {111, 111, 111, 111} sets of directions; if this were the 
case, the choice between the two sets (two degenerate 
global minima of f2) would constitute a kind of unphys- 
ical or "anomalous" symmetry breaking from cubic to 
tetrahedral. Instead, we find that O is, at least to our 
machine precision, rotationally invariant with respect to 
the orientation of the sp^ hybrids, just as would be the 
case for an isolated Cl^ ion in free space. This implies 
that the tetrahedron of the Wannier centers around each 




FIG. 4. Contour plots for the maximally-localized Wannier 
functions in ethylene, C2H4. (a) One of the four C-H Wannier 
functions, shown in the x — y plane, (b) One of the two C=C 
Wannier functions, shown in a; — 2: plane. 
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CI atom is free to rotate without any discernible decrease 
of localization. 

Finally, consistent with the idea that a larger gap 
is linked to a higher degree of localization, we find 
a total f]=4.159 A^, with r2i=3.354, f]oD=0.805 and 

VII. DISCUSSION 

We have discussed a technique for obtaining a set of 
well-localized Wannier functions for a given band or com- 
posite set of bands in a crystalline solid. We have in mind 
several kinds of applications for this method. 

First, we believe that this approach may help to obtain 
chemical intuition about the nature of chemical bonds in 
solids, and to characterize trends in bonding properties 
within classes of solids. As emphasized in the introduc- 
tion, the Wannier functions defined here are the natu- 
ral geneijalization of the concept of "localized molecular 
orbitals"t3L£l to the case of solids. As illustrated in the 
examples of GaAs and ethylene (C2H4) above, the de- 
termination of the Wannier functions can give chemical 
intuition into the nature of the bond orbitals of the mate- 
rial, including the spontaneous symmetry breaking that 
occurs in the Lewis picture of a double or triple bond. 
We also suspect that it may be instructive to generate, 
characterize and plot the Wannier functions across a se- 
ries of compounds, e.g., for II- VI semiconductors as one 
varies from wide- to narrow-gap members, or in cubic 
perovskites of varying compositipn. Moreover, as em- 
phasized by Hierse and Stechel,E£l the Wannier functions 
may be transferable to a considerable degree for similar 
bonds in different chemical systems (for example, for C-H 
or C-C bonds in a variety of hydrocarbons) . It should be 
noted, however, that this is even more likely to be true 
for non-orthogonal Wannier-like functions,tj as opposed 
to the orthogonal ones studied here. 

Second, it is possible that the Wannier functions may 
prove suitable as a basis for use in constructing theo- 
ries of interacting or strongly-correlated electron systems. 
For example, it might be possible to build good approx- 
imate correlated wavefunctions from sums of Slater de- 
terminants of the Wannier functions. For this purpose, 
one would clearly need to choose a set of bands that 
includes some low-lying unoccupied states of the one- 
particle mean-field Hamiltonian. Similarly, it might be 
possible to build accurate model Hamiltonians for for 
magnetic systems, or for transport properties of metals. 
(Again, for metals it would appear necessary to choose a 
composite group of bands that brackets the Fermi level, 
and to specify the occupation as a kind of density matrix 
in the Wannier indices.) 

Third, the present scheme might prove useful for pre- 
dicting the suitability of linear-scaling methods for differ- 
ent kinds,of insulating materials. Since the linear-scaling 
methodsO depend strongly on the localization properties 



of the Wannier functions (or, closely related, the den- 
sity matrix), the present scheme might be a simple and 
useful way to characterize the degree of localization for 
a given target material. This information might then 
help predict whether the material is a good candidate for 
a linear-scaling method; and if so, what type of linear- 
scaling method is likely to work best, and what real-space 
cutoff parameter is likely to be required. 

Finally, an important feature of the present approach 
is that it generates a list of the locations of the Wannier 
centers. This information alone can often be of crucial 
importance. In fact, we envisage a number of interest- 
ing applications in which one essentially throws away all 
other information about the Wannier functions, keeping 
only their locations. For example, the shift of the Wan- 
nier center away from the bond center might serve as a 
kind of measure of bond ionicity. Also, the vector sum 
of the Wannier centers immediately gives the bulk elec- 
tronic polarization P; all three Cartesian components of 
P can thus be determined simultaneously using a con- 
ventional k-mesh, instead of constructing separate spe- 
cial k-point strings to compute each separate Cartesian 
component of P as is needed otherwise.^ 

But more importantly, the information on the locations 
of the Wannier functions may open the possibility of cal- 
culating properties that cannot otherwise be obtained, 
especially for distorted, defective, or disordered systems. 
For example, it becomes possible not only to calculate 
the Born (dynamical) effective charge Z*, but also to de- 
compose it into displacements of individual neighboring 
Wannier centers. To illustrate this idea, we have car- 
ried out a calculation on a cubic supercell of GaAs con- 
taining 64 atoms (F-only k-point sampling), in which all 
atoms are in their equilibrium positions except for one Ga 
atom that is displaced by 0.1 A along the [111] direction. 
Observing the consequent displacement of the Wannier 
centers from their bulk crystalline positions, we find a to- 
tal Zq^ of 2.04, in good agreement with the established 
theoretical, value of 1.99 as calculated by linear-response 
methods.E2l Moreover, in arriving at the total electronic 
Zq^'=-0.96, we find contributions of -1.91, -1-0.65, and 
-1-0.30 from the groups of four first-neighbor, 12 second- 
neighbor, and remaining further-neighbor Wannier cen- 
ters, respectively. It is interesting to note that inclusion 
of nearest-neighbor contributions alone would thus sig- 
nificantly overestimate the magnitude of Zq°^, and that 
the second-neighbor Wannier centers move in the oppo- 
site direction to the Ga atom motion. If we repeat the 
calculation displacing one As atom,|--we obtain a total 
of -2.07 (the acoustic sum ruleEJ is only approxi- 
mately satisfied with a finite k-point sampling). The to- 
tal electronic Z'^f=-7.07 has now contributions of -1.74, 
-4.63, and -0.71 from the groups of four first-neighbor, 12 
second-neighbor, and remaining further-neighbor Wan- 
nier centers, respectively. 

In fact, the pattern of displacements of the Wannier 
centers can be regarded as defining a kind of coarse- 



17 



grained representation of the polarization field, P(r). To 
illustrate this idea more directly, we have carried out a 
calculation for bulk GaAs in which a long-wavelength 
transverse optical (TO) phonon has been frozen in. We 
take the wavevector q = [tt / Aa){x+y) {a is the lattice pa- 
rameter) and relative displacements ^(r) = sin(q • r)z 
in a 16-atom supercell, composed of 8 unit cells repeated 
in the (110) direction. We assign a displacement ampli- 
tude ^0 = O.Ola to the Ga sublattice, and — Mqi,/Ma_s 
to the As sublattice (Mca and Mas are the masses of 
the two species; the center of mass doesn't move). Ob- 
serving the resulting displacements of the Wannier cen- 
ters, we can obtain a picture on how the local polar- 
ization changes from cell to cell (say, by summing all 
the 4 Wannier centers surrounding one As atom); fitting 
these to the same form P(r) — Pq sin(q • r)z, we obtain a 
Po=0.249, and, via the acoustic sum rule {Z^ 



), we get Z*f 



-1.52 and Z*f = -6.48. These re- 
sults are only in fair agreement with the bulk values; the 
discrepancies might be due to the finite size of our super- 
cell, or to not having used the proper eigenvector for the 
phonon mode considered. However, the main point of 
this demonstration is that, given the calculation on the 
supercell containing the frozen TO phonon, there is no 
other way that the transverse component of the polariza- 
tion field could have been obtained. Since the mode is 
transverse, P(r) cannot be determined from the charge 
density; since q 7^ 0, the Berry-phase approach does not 
apply; and since the displacement is finite, the linear- 
response approach is not directly applicable. However, 
the present scheme allows a direct finite-difference calcu- 
lation of the transverse polarization field, a quantity that 
was previously unavailable. 

It would be interesting to apply this kind of analysis 
to supercell simulations of amorphous systems such as 
a-H2 or a-GaAs. Once again, while only the longitu- 
dinal part of P(r) can be determined from the charge 
density, a similar determination of both the longitudi- 
nal and transverse components is possible with access to 
the displacements of the Wannier centers, thus leading 
to a more complete theory of the dielectric properties 
of such systems. This information might be used to as- 
sist the approach of Ref. in which the infrared ab- 
sorption spectrum of an amorphous system is extracted 
from a molecular-dynamics simulation. As a limited test, 
we have carried out calculations for a 64-atom supercell 
of crystalline Si with random displacements typical of 
~1000K, and find that the calculation of the displaced 
Wannier centers is straightforward. 

Finally, we conclude by pointing out that our work 
opens numerous possibilities for further development and 
future study. On a practical level, it might be use- 
ful to explore the use of more accurate, higher-order fi- 
nite difference formulas for Vk (see Appendix ^) to see 
whether convergence with respect to k-point sampling 
can be improved. It might be interesting to apply our 
analysis within the semi-empirical tight-binding context. 



Ga 
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although it should be noted that matrix elements of x, 
y, and z (and, for fij, also of r^) would be needed, in ad- 
dition to the Hamiltonian and overlap matrix elements. 
Going beyond the scope of the present work, it might 
be interesting to other explore localization criteria e.g., 
the maximization of the Coulomb self-interaction of the 
Wannier functions. It would also be of great interest 
to develop a corresponding theory of maximally-localized 
non- orthogonal Wannier-like functions. (While the direct 
connection to the polarization properties would be lost, 
there would be important implications for some linear- 
scaling algorithms.) Finally, there are many questions 
of a mathematical character that deserve further study. 
For example, is it possible to prove that our Wannier 
functions (those that minimize Q) have exponential de- 
cay, even in the general non-centrosymmetric multi- band 
case? Are they always real, as conjectured in Sec. VB ? 
And are there further results that can be derived re- 
garding the interrelations between the metric tensor, the 
Berry connection, and the Berry curvature, as discussed 
in Appendix ^ We hope that our work will stimulate 
some investigations of these questions. 
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APPENDIX A: MINIMIZATION OF SPREAD 
FUNCTIONAL IN REAL SPACE 



In Sec. in above, the problem of finding the opti- 
mally localized Wannier functions for a periodic system 
was formulated directly in real space. In this Appendix, 
we briefly reformulate the problem for the case of a fi- 
nite system (cluster, molecule, etc.), and sketch how the 
minimization of the functional can be performed in this 
case. This provides a complementary perspective to the 
k-space procedure discussed in the main text. 

We change notation |Rn) \i) and now refer to the i 
as "localized orbitals" rather than "Wannier functions," 
but their meaning is the same: they are a set of orthonor- 
mal orbitals spanning the Hamiltonian eigenstates in an 
energy range of interest (e.g., for the occupied valence 
states of a molecule or cluster). 



Following the approach of Sec. [II, we decompose fl — 



J2i[v)i~''^i] iiifo an invariant part fli = J2a [P'l^aQ'r'a] 
(where P ~ and Q = 1 — P) and a remain- 

der n ^ HaHii.] Defining matrices = 

(i|a;|j), A^D.ij — Xij Sij, X' = X ~ Xd, and similarly for 
Y and Z, this can be rewritten 
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n = tT [x'^ + Y'' 



(Al) 



Thus if X, Y, and Z could be simultaneously diagonal- 
ized, then Q could be minimized to zero, but for non- 
commuting matrices this is not possible. In a sense, our 
job is to perform the optimal approximate simultaneous 
co-diagonalization of the three Hermitian matrices X, 
Y, and Z by a single unitary transformation. We are 
not aware of a formal solution for this problem, but a 
steepest-descent numerical solution is fairly straightfor- 
ward. Since tr [X'Xd] = 0, etc.. 



= 2 tr [X'dX + Y'dY + Z'dZ] 



(A2) 



We consider an infinitesimal unitary transformation 
\i) Wjilj) (where dVF is antihermitian), from 

which dX = [X,dW], etc. Inserting in Eq. (A2) and us- 
ing tr [A[B,C]] = tr [C[A,B]] and [X',X] ^ [X',Xd], 
we obtain d^l — tr [dW G] where 



G = 2 



{[X',Xu] + [Y',Yu] + [Z',Zu]} 



(A3) 



so that the desired gradient is dV,/dW — G a.s given 
above. The minimization can then be carried out using 
steepest descents following the general approach outlined 
in Sec. IV D| . More sophisticated but related methods are 
discussed in Ref. 

If this approach is applied to a finite system having 
a crystalline interior, the solutions in the interior are 
expected to correspond precisely with the maximally- 
localized Wannier functions as determined using the k- 
space methods of the main text. In the vicinity of sur- 
faces or defects, or for disordered materials, the solutions 
will essentially correspond to the "generalizficl-Wannier 
functions" discussed by previous authors E3'Ea~E3 



APPENDIX B: FINITE-DIFFERENCE 
FORMULAS FOR k-SPACE GRIDS 

We assume that the Brillouin zone has been discretized 
into a uniform Monkhorst-Pack mesh.ES Let b be a vector 
connecting a k-point to one of its near neighbors, and let 
Z be the number of such neighbors to be included in the 
finite-difference formulas. We seek the simplest possible 
finite-difference formula for Vk, i.e., the one involving 
the smallest possible Z. When the Bravais lattice point 
group is cubic, it will only be necessary to include the first 
shell oi Z = 6, 8, or 12 k-neighbors for simple cubic, bcc, 
or fee k-space meshes, respectively. Otherwise, further 
shells must be included until it is possible to satisfy the 
condition 



Wb babp = 6af3 



(Bl) 



by an appropriate choice of a weight Wb associated with 
each shell |b| = b. For the three kinds of cubic mesh, Eq. 



(Bl) is satisfied with Wb = 3/Zb'^ (single shell). Tak- 
ing next the slightly more complicated case of an or- 
thorhombic lattice, one can let b run over the two near- 
est neighbors in each Cartesian direction (Z = 6), with 
Wb = 1/26^ for the two neighbors at ztb^x, etc. Even 
in the worst case of minimal (triclinic) symmetry, only 
six pairs of neighbors {Z = 12) should be needed, as the 
freedom to choose six weights should allow one to satisfy 
the six independent conditions comprising Eq. ( p3lD . 

Now, if /(k) is a smooth function of k, its gradient can 
be expressed as 



V/(k) =^^bb[/(k + b)-/(k)] 



(B2) 



We can check the correctness of this finite-difference 
formula by applying it to the case of a linear func- 
tion /(k) = /o -I- g • k, for which we find VQ/(k) = 
J^h'^bY^gbagpbp ^ ga- In a similar way, 



|V/(k)r=^^, [/(k + b)-/(k)]^ 



(B3) 



We note that improved accuracy and k-set convergence 
might be obtained by utilizing improved, higher-order 
finite-difference formulas involving more shells of neigh- 
boring k-points, but we have not explored this possibility 
here. 



APPENDIX C: GEOMETRIC PROPERTIES AND 
COMPLEXITY OF ELECTRON BANDS 

Consider a manifold of J orthonormal states |'0n(A)), 
n = 1,..., J, depending on a continuous d-dimensional 
parameter A. Alternatively, one can view these as rep- 
resenting the projection P(A) — J2n IV'n(A))('0n(A)|. For 
the application to electron bands in crystals, we identify 
A — > k and tpnW itnk- Here, we investigate the ge- 
ometric properties of such a manifold, generalizing the 
single-state (J — 1) results of Refs. p4|-p6| to the multi- 
state case. 

One can define two kinds of intrinsic geometric prop- 
erties: a geometric distance and a geometric phase. We 
consider the former first. The geometric distance D12 
between two points Ai and A2 is here taken to be 



D 



tr[PiQ2 



\Pi 



(CI) 



where Q{\) = 1 — P{^)- In the case of a single state, this 
becomes = 1 — |(V'i|'02)P, which for small separations 
is consistent with the slightly different definition Dfj — 
2 — 2|(-0i|'02)| of Ref. Considering the distance for 
infinitesimal separations, one can define a Riemannian 
metricEil 



Dl,\+dx - X! ^'^^ ^'^^ 



(C2) 



a/3 
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Introducing the notation 'il'n,a = dip„/dXa, etc., and 
making use of 



= {lp„\lpm,a) + {^n,a\'<Pm) 



= (ipn\'4'm,af}) + (lpn,a(3\'<Pm) + 2 Rc (^n,a\'4'm,p) 



(C3) 



(C4) 



which follow from the fact that the ipn remain orthonor- 
mal at first and second order in dX, the metric gap be- 
comes, after some manipulation, 

gai3 = Re ^ {lpn,a\lpn,p) " ^ {ipn.allpm) (V'mlV'n,/?) , 
n mn 

(C5) 

which ijeduces in the single-band case to the expression 
of Patikj 

From Eq. (CI) it is obvious that the distance, and 
thus the metric, are gauge-invariant quantities. These are 
therefore intrinsic properties of the manifold. One way 
of thinking about the metric is to observe that for any 
given path in A space, the line integral of g^^^ along the 
path provides a measure of the total "quantum distance" 
along the path; intuitively, it is a measure of the amount 
of change of character of the states as one traverses the 
path. The physical meaning of this distance for the case 
of temporal evolution of quantum states is discussed in 
Refs. ||-|6|. 

The second type of geometric object that caH-|be de- 
fined is a "geometric phase" or "Berry phase." E3 Here, 
one is interested in considering closed paths in A space, 
and relating the phase (or, for the multi-state case, the 
unitary rotation) induced by adiabatic ( "parallel" ) trans- 
port along the path. The multi-state ( "nottiAbelianJli 
case has beea discussed by Wilczek and ZeeEEl, Mead,E£l 
and RestaJlj One can define a (non-gauge-invariant) 
Berry connection 

Aa,nm = i (V'nlV'm.a) (C6) 

and a (gauge-covariant) Berry curvature 

(C7) 

The invariants of the latter, such as 

trB„^ = 2Im^(V;„,a|Vn,/3) , (C8) 

n 

[see Eq. (3.29) of Ref. [l^] are thus gauge-invariant. (We 
shall use the notation 'tr' and 'Tr' to denote electronic 
and Cartesian traces, respectively.) 

There i s a tantalizing similarity between the metric 
gap , Eq. (|C5|), and the quantum trace of the Berry cur- 
vature, Eq. (|C8|). In fact, defining the gauge-invariant 
quantity 



^al3 = ^ (V'«,q|Q|V'«,/3) 



(C9) 



where again Q — 1 — P , and using Eq. (|C3|) to show 
that the second term in Eq. ( |C5| ) is intrinsically real, 
we obtain simply gap = ReJ-Q,^ and tr Bap = 2Im!Fap- 
This suggests that there mEOi-.be, some deep connections 
between the two quantities.LflO In the case where the 
states ipn are_^eigenstates of a Hamiltonian H{X), one 
moreover hasEj 



J-ap 



J oo 

E E 

n=l m=J+l 



{^n\Ha\^rn){'lpm\Hp\lJjn) 



(CIO) 



where Ha = dH{X)/dXa- 

We now return to the case of electron bands in crys- 
tals, A ^ k and V-'n(A) u„k, and discuss the geometric 
properties induced by the band projection operator P^^\ 
Note that g, A, and B have units of P, I, and P, respec- 
tively. Again focusing first on the metri c, an d comparing 
Eq. (H) with the definitions (^) and (^, we find 



^ E E ^"/S bp 



(Cll) 



k,b a/3 



or, using Eq. (Bl) and restoring the continumn limit. 



ni = 



V 



(27r)3 



dkTr.g(k) 



(C12) 
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where the integral is over the Brillouin zone. Thus, the 
invariant part of the spread functional is nothing other 
than the Brillouin-zone average of the trace of the metric! 

It may be interesting to see whether other global prop- 
erties of the metric might be given some physical inter- 
pretation. In particular, we define a dimensionless and 
gauge-invariant quantity 



C = 



rfk det^/2 g(k) 



(C13) 



BZ 



We shall call this the "complexity" of the bands. A/Iath- 
ematically, it is really nothing other than the volume of 
the Brillouin zone as measured according to the metric 
g. However, we have called it the "complexity" because 
it measures the variation of the character of the band 
projection operator P^^^ throughout the Brillouin zone. 
Everything said here applies to any isolated band or com- 
posite group of bands, but we have in mind primarily the 
case where all the occupied valence bands in an insulator 
are considered as a composite group. In this case, and as- 
suming that one is only interested in quantities (such as 
total energies and forces) that can be expressed as a trace 
over the bands, the complexity might thus be expected to 
refiect (and even predict) the number of k-points needed 
for an accurate sampling of the Brillouin zone. We have 
not tested this idea numerically, but this would clearly 
be an interesting avenue for future exploration. 



20 



Turning now to phase properties, we note that a finite- 
different representation of the Berry connection is 



(CM) 



curvature will vanish if and only if the band-projected 
position operators PxP, PyP, and PzP commute with 
one another; as discussed following Eq. (17), this is also 
just the condition that Q vanishes at the minimum. 



Restoring the continuum limit in /c-space, we can write 
V 



(27r)3 



dkA„„(k) 



(C15) 



BZ 



and more generally, 

(Om|r|Rn) = ( dk A„„(k) e^"^'^ . (C16) 



The right-hand side is just Amn(R.)j_the Fourier coeffi- 
cient of the Berry curvature. Eq. (C15) is just the ex- 
pression for the position of the Wannier[-| pe j| it [y^ which 
contributes to the electronic polarization.BEJila'tS More- 
over, 



V 



(27r)3 



dk I A„„(k) - r„ 



BZ 



OD 



y ^ 



dk I A™„(k) 



(C17) 



(C18) 
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Eqs. (C17-C18) show that the non-invariant parts of the 
spread functional are also conveniently written in terms 
of the Berry connection. If the above equations are reex- 
pressed in terms of the Fourier coefficients Am„(R), Eqs. 
( p^ ) and (|2^) are immediately recovered. 



In the single-band case, we showed in Sec. p^V C 2| that 
the minimum value of could be related to the transverse 
part of the Berry connection, which in turn is determined 
by the gauge-invariant Berry curvature. In the multiband 
case, the Berry curvature i3™(k) is no longer gauge- 
invariant, and it is not obvious whether it is possible to 
make a corresponding decomposition. Nevertheless, one 
can derive similar correspondences as those above for A. 
So, 



i?^,"(k) = 



,,/3) +«(Mm,/3|(31 



(C19) 



B™^"(R) = -i ( «™ I r^rn - rpQv^ \u^) . (C20) 
Making use of raQrp — rpQva = [PvaP, PrpP], one finds 

II [Pr^P,PrpP] \\l = J2J2 I' 

R mn 

= / dk \\B^0(k)f . (C21) 

Each form above is manifestly gauge-invariant and 
positive-definite. Thus, it can be seen that the Berry 
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